Reading the Python sample code for the Skeleton Tracking SDK [Windows edition]

せーの

2020.03.11


Hi, this is せーの.

The other day a Skeleton Tracking SDK made by cubemos was released for Intel RealSense, so I gave it a try.

Intel RealSense finally got its long-awaited Skeleton Tracking SDK (made by cubemos), so I tried it out [Windows edition]

Pairing the Intel RealSense D435i with the Skeleton Tracking SDK was easy, so I read through the code [Windows edition]

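When paired with a RealSense camera, the SDK also handles live streams nicely.

Up to the previous article I had been working with the C# samples. There is a Python sample as well, so this time I'll run it and read through its code.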

Installation

First, install the SDK on Windows. The first article covers this in detail, so please refer to it.

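As for Python, the documentation says 3.6 or later. When I tried 3.8, however, the DLLs could not be loaded at the import statements of the sample, apparently because the wrapper's ctypes CDLL loading does not support 3.8 yet. It looks like modifying the sample code, for example with os.add_dll_directory(), would get it working, but that is beside the point of this article, so I'll use 3.7 this time.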
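For reference, Python 3.8 on Windows stopped searching PATH when resolving the dependencies of DLLs loaded through ctypes, and added os.add_dll_directory() to register extra search directories explicitly. The following is a minimal sketch of that workaround, not something verified against this SDK. The helper name register_sdk_dll_dir and the "bin" subfolder are assumptions made for illustration; check where the DLLs actually live in your installation.

```python
import os
import sys


def register_sdk_dll_dir(env_var="CUBEMOS_SKEL_SDK", subdir="bin"):
    """Register <SDK>/<subdir> as a DLL search directory.

    Returns the registered directory, or None when nothing was done
    (variable not set, folder missing, or not Windows / Python < 3.8).
    The "bin" default is an assumption, not taken from the SDK docs.
    """
    sdk_path = os.environ.get(env_var)
    if sdk_path is None:
        return None
    dll_dir = os.path.join(sdk_path, subdir)
    if not os.path.isdir(dll_dir):
        return None
    # os.add_dll_directory only exists on Windows with Python 3.8+.
    if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
        os.add_dll_directory(dll_dir)
        return dll_dir
    return None


# Call this before importing the cubemos modules.
registered = register_sdk_dll_dir()
print("Registered DLL directory:", registered)
```

The call has to run before the cubemos imports, because that is where the DLLs are loaded.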

Install the required modules with pip. The Python sample uses cubemos-core, cubemos-skel, and opencv-python. The wheels are in the "wrappers" folder inside the SDK installation directory, and the samples folder contains a requirements.txt file listing the required modules, so we use both for the installation.

pip install --find-links="%CUBEMOS_SKEL_SDK%\wrappers\python" cubemos-core cubemos-skel 
cd "%CUBEMOS_SKEL_SDK%\samples\python\" 
pip install -r requirements.txt

That's all. Easy, isn't it?
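To confirm that the installation took effect before running the sample, a quick check like the one below can help. It is only a sketch: it looks up the top-level packages that the sample imports (cubemos and cv2) without importing them, so it does not prove that the native DLLs load. The helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


# Top-level package names taken from the import lines of the sample.
for name in ("cubemos", "cv2"):
    status = "found" if is_installed(name) else "MISSING"
    print("{:10s} {}".format(name, status))
```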

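Running the sample

Next, run the sample in the samples folder. Pass the path of the image you want to run skeleton detection on as the argument, and, to make the result easy to see, have the sample write out an image with the detection result drawn on it.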

python estimate-keypoints.py -o C:\tmp\output.jpg ..\res\images\skeleton_estimation.jpg

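If the resulting output.jpg has the skeleton detection result drawn on it, it worked.

The image is just an argument, so you can pass any image you like and the sample will run skeleton detection on it.

Now let's read the code.

Reading the code

Here is the full code.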

#!/usr/bin/env python3
from cubemos.core.nativewrapper import CM_TargetComputeDevice
from cubemos.core.nativewrapper import initialise_logging, CM_LogLevel
from cubemos.skeleton_tracking.nativewrapper import Api, SkeletonKeypoints
import cv2
import argparse
import os
import platform
from pprint import pprint

keypoint_ids = [
    (1, 2),
    (1, 5),
    (2, 3),
    (3, 4),
    (5, 6),
    (6, 7),
    (1, 8),
    (8, 9),
    (9, 10),
    (1, 11),
    (11, 12),
    (12, 13),
    (1, 0),
    (0, 14),
    (14, 16),
    (0, 15),
    (15, 17),
]


def default_log_dir():
    if platform.system() == "Windows":
        return os.path.join(os.environ["LOCALAPPDATA"], "Cubemos", "SkeletonTracking", "logs")
    elif platform.system() == "Linux":
        return os.path.join(os.environ["HOME"], ".cubemos", "skeleton_tracking", "logs")
    else:
        raise Exception("{} is not supported".format(platform.system()))


def default_license_dir():
    if platform.system() == "Windows":
        return os.path.join(os.environ["LOCALAPPDATA"], "Cubemos", "SkeletonTracking", "license")
    elif platform.system() == "Linux":
        return os.path.join(os.environ["HOME"], ".cubemos", "skeleton_tracking", "license")
    else:
        raise Exception("{} is not supported".format(platform.system()))


def check_license_and_variables_exist():
    license_path = os.path.join(default_license_dir(), "cubemos_license.json")
    if not os.path.isfile(license_path):
        raise Exception(
            "The license file has not been found at location \"" +
            default_license_dir() + "\". "
            "Please have a look at the Getting Started Guide on how to "
            "use the post-installation script to generate the license file")
    if "CUBEMOS_SKEL_SDK" not in os.environ:
        raise Exception(
            "The environment Variable \"CUBEMOS_SKEL_SDK\" is not set. "
            "Please check the troubleshooting section in the Getting "
            "Started Guide to resolve this issue." 
        )


def get_valid_limbs(keypoint_ids, skeleton, confidence_threshold):
    limbs = [
        (tuple(map(int, skeleton.joints[i])), tuple(map(int, skeleton.joints[v])))
        for (i, v) in keypoint_ids
        if skeleton.confidences[i] >= confidence_threshold
        and skeleton.confidences[v] >= confidence_threshold
    ]
    valid_limbs = [
        limb
        for limb in limbs
        if limb[0][0] >= 0 and limb[0][1] >= 0 and limb[1][0] >= 0 and limb[1][1] >= 0
    ]
    return valid_limbs


def render_result(skeletons, img, confidence_threshold):
    skeleton_color = (100, 254, 213)
    for index, skeleton in enumerate(skeletons):
        limbs = get_valid_limbs(keypoint_ids, skeleton, confidence_threshold)
        for limb in limbs:
            cv2.line(
                img, limb[0], limb[1], skeleton_color, thickness=2, lineType=cv2.LINE_AA
            )


parser = argparse.ArgumentParser(description="Perform keypoing estimation on an image")
parser.add_argument(
    "-c",
    "--confidence_threshold",
    type=float,
    default=0.5,
    help="Minimum confidence (0-1) of displayed joints",
)
parser.add_argument(
    "-v",
    "--verbose",
    action="store_true",
    help="Increase output verbosity by enabling backend logging",
)

parser.add_argument(
    "-o",
    "--output_image",
    type=str,
    help="filename of the output image",
)

parser.add_argument("image", metavar="I", type=str, help="filename of the input image")



# Main content begins
if __name__ == "__main__":
    try:
        #Parse command line arguments
        args = parser.parse_args()
        check_license_and_variables_exist()
        #Get the path of the native libraries and ressource files
        sdk_path = os.environ["CUBEMOS_SKEL_SDK"]
        if args.verbose:
            initialise_logging(sdk_path, CM_LogLevel.CM_LL_DEBUG, True, default_log_dir())

        img = cv2.imread(args.image)
        #initialize the api with a valid license key in default_license_dir()
        api = Api(default_license_dir())
        model_path = os.path.join(
            sdk_path, "models", "skeleton-tracking", "fp32", "skeleton-tracking.cubemos"
        )
        api.load_model(CM_TargetComputeDevice.CM_CPU, model_path)
        #perform inference
        skeletons = api.estimate_keypoints(img, 192)

        # perform inference again to demonstrate tracking functionality.
        # usually you would estimate the keypoints on another image and then
        # update the tracking id
        new_skeletons = api.estimate_keypoints(img, 192)
        new_skeletons = api.update_tracking_id(skeletons, new_skeletons)

        render_result(skeletons, img, args.confidence_threshold)
        print("Detected skeletons: ", len(skeletons))
        if args.verbose:
          print(skeletons)
          
        if args.output_image:
            isSaved = cv2.imwrite(args.output_image, img)
            if isSaved:
                print("The result image is saved in: ", args.output_image)
            else:
                print("Saving the result image failed for the given path: ", args.output_image)


            
    except Exception as ex:
        print("Exception occured: \"{}\"".format(ex))
# Main content ends

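The directory paths and license checks, which lived in a separate file in the C# version, sit at the top here, and the functions that do the actual work start around line 66.
The Python version also contains the command-line argument definitions, so let's go over those first.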

parser = argparse.ArgumentParser(description="Perform keypoing estimation on an image")
parser.add_argument(
    "-c",
    "--confidence_threshold",
    type=float,
    default=0.5,
    help="Minimum confidence (0-1) of displayed joints",
)
parser.add_argument(
    "-v",
    "--verbose",
    action="store_true",
    help="Increase output verbosity by enabling backend logging",
)

parser.add_argument(
    "-o",
    "--output_image",
    type=str,
    help="filename of the output image",
)

parser.add_argument("image", metavar="I", type=str, help="filename of the input image")

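These define, respectively: -c, the confidence threshold for the joints to display (default 0.5); -v, whether to enable backend logging and print detailed information to the console; and -o, the file name to save the rendered result to (if omitted, no image is saved). The positional argument is the path of the input image.

Now let's follow the flow of the main block, which starts at line 118.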
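To see how these definitions behave, you can hand the parser an explicit argument list instead of the real command line. The sketch below rebuilds the same four arguments (help texts omitted for brevity; build_parser is a name introduced for this example) and parses two sample command lines.

```python
import argparse


def build_parser():
    """Rebuild the argument definitions used by the sample."""
    parser = argparse.ArgumentParser(description="Perform keypoint estimation on an image")
    parser.add_argument("-c", "--confidence_threshold", type=float, default=0.5)
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-o", "--output_image", type=str)
    parser.add_argument("image", metavar="I", type=str)
    return parser


parser = build_parser()

# Only the required positional argument: the options fall back to defaults.
minimal = parser.parse_args(["input.jpg"])
print(minimal.confidence_threshold, minimal.verbose, minimal.output_image)
# → 0.5 False None

# Every option given explicitly.
full = parser.parse_args(["-c", "0.3", "-v", "-o", "out.jpg", "input.jpg"])
print(full.confidence_threshold, full.verbose, full.output_image, full.image)
# → 0.3 True out.jpg input.jpg
```

Because -o has no default, args.output_image is None when the option is omitted, which is why the sample only calls cv2.imwrite inside "if args.output_image:".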

Initialise

        args = parser.parse_args()
        check_license_and_variables_exist()
        sdk_path = os.environ["CUBEMOS_SKEL_SDK"]
        if args.verbose:
            initialise_logging(sdk_path, CM_LogLevel.CM_LL_DEBUG, True, default_log_dir())

        img = cv2.imread(args.image)
        
        api = Api(default_license_dir())

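After parsing the arguments, the code checks the license and reads the SDK path from the environment variable. It then loads the target image with cv2, the OpenCV module, and creates the Api object. This part is the same as in the C# version.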

estimate

        model_path = os.path.join(
            sdk_path, "models", "skeleton-tracking", "fp32", "skeleton-tracking.cubemos"
        )
        api.load_model(CM_TargetComputeDevice.CM_CPU, model_path)
        #perform inference
        skeletons = api.estimate_keypoints(img, 192)

        # perform inference again to demonstrate tracking functionality.
        # usually you would estimate the keypoints on another image and then
        # update the tracking id
        new_skeletons = api.estimate_keypoints(img, 192)
        new_skeletons = api.update_tracking_id(skeletons, new_skeletons)

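The code builds the model path, loads the model, and runs estimation on the target image. It then runs estimation a second time to demonstrate the tracking feature, which is very helpful in a sample.
Calling api.update_tracking_id appears to assign the same IDs to the skeletons in the newly estimated result as in the previous one.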
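The SDK does not expose how update_tracking_id matches skeletons, so the following is purely a conceptual toy and not the SDK's algorithm. It illustrates the general idea of carrying IDs over between frames by matching each new skeleton to the nearest previous one by centroid. Every name in it (centroid, carry_over_ids, max_distance) is hypothetical.

```python
import math


def centroid(joints):
    """Mean of the joints with non-negative coordinates, or None if there are none."""
    valid = [(x, y) for (x, y) in joints if x >= 0 and y >= 0]
    if not valid:
        return None
    n = len(valid)
    return (sum(x for x, _ in valid) / n, sum(y for _, y in valid) / n)


def carry_over_ids(previous, current, max_distance=50.0):
    """previous: dict of id -> joints. current: list of joints.

    Returns one ID per entry in current: the ID of the closest unused
    previous skeleton within max_distance, otherwise a fresh ID.
    """
    next_id = max(previous, default=-1) + 1
    unused = dict(previous)
    ids = []
    for joints in current:
        c = centroid(joints)
        best_id, best_dist = None, max_distance
        for pid, pjoints in unused.items():
            pc = centroid(pjoints)
            if c is None or pc is None:
                continue
            d = math.hypot(c[0] - pc[0], c[1] - pc[1])
            if d <= best_dist:
                best_id, best_dist = pid, d
        if best_id is None:
            best_id = next_id
            next_id += 1
        else:
            del unused[best_id]
        ids.append(best_id)
    return ids


previous = {0: [(10, 10), (20, 20)], 1: [(200, 200), (210, 210)]}
current = [[(202, 201), (212, 211)], [(11, 10), (21, 20)], [(500, 500)]]
print(carry_over_ids(previous, current))
# → [1, 0, 2]
```

The two skeletons that barely moved keep their IDs, and the one that appears far from any previous skeleton gets a new ID.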

render

render_result(skeletons, img, args.confidence_threshold)

def render_result(skeletons, img, confidence_threshold):
    skeleton_color = (100, 254, 213)
    for index, skeleton in enumerate(skeletons):
        limbs = get_valid_limbs(keypoint_ids, skeleton, confidence_threshold)
        for limb in limbs:
            cv2.line(
                img, limb[0], limb[1], skeleton_color, thickness=2, lineType=cv2.LINE_AA
            )

def get_valid_limbs(keypoint_ids, skeleton, confidence_threshold):
    limbs = [
        (tuple(map(int, skeleton.joints[i])), tuple(map(int, skeleton.joints[v])))
        for (i, v) in keypoint_ids
        if skeleton.confidences[i] >= confidence_threshold
        and skeleton.confidences[v] >= confidence_threshold
    ]
    valid_limbs = [
        limb
        for limb in limbs
        if limb[0][0] >= 0 and limb[0][1] >= 0 and limb[1][0] >= 0 and limb[1][1] >= 0
    ]
    return valid_limbs

keypoint_ids = [
    (1, 2),
    (1, 5),
    (2, 3),
    (3, 4),
    (5, 6),
    (6, 7),
    (1, 8),
    (8, 9),
    (9, 10),
    (1, 11),
    (11, 12),
    (12, 13),
    (1, 0),
    (0, 14),
    (14, 16),
    (0, 15),
    (15, 17),
]

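This part draws the estimated skeleton coordinates onto the image with OpenCV.
First, the get_valid_limbs function filters the estimated joints and stores the usable ones as tuples. Each joint comes with a confidence value, so only connections whose two joints are both at or above the threshold are kept. A second pass then drops any connection that contains a negative coordinate, which is how a joint that was not detected shows up. Since the goal is to draw a line between two points, each entry holds a pair of points.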
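The filtering is easy to try without the SDK. The sketch below copies get_valid_limbs from the sample and feeds it a hypothetical stand-in object (FakeSkeleton, invented for this example) that has only the two attributes the function reads. The joint values are made up, and the use of (-1, -1) for an undetected joint is an assumption inferred from the >= 0 check in the sample.

```python
from collections import namedtuple

# Hypothetical stand-in for the SDK's SkeletonKeypoints.
FakeSkeleton = namedtuple("FakeSkeleton", ["joints", "confidences"])


def get_valid_limbs(keypoint_ids, skeleton, confidence_threshold):
    # Keep only connections whose two joints both meet the threshold.
    limbs = [
        (tuple(map(int, skeleton.joints[i])), tuple(map(int, skeleton.joints[v])))
        for (i, v) in keypoint_ids
        if skeleton.confidences[i] >= confidence_threshold
        and skeleton.confidences[v] >= confidence_threshold
    ]
    # Drop connections that contain a negative coordinate.
    valid_limbs = [
        limb
        for limb in limbs
        if limb[0][0] >= 0 and limb[0][1] >= 0 and limb[1][0] >= 0 and limb[1][1] >= 0
    ]
    return valid_limbs


skeleton = FakeSkeleton(
    joints=[(10.4, 20.9), (30.0, 40.0), (50.0, 60.0), (-1.0, -1.0)],
    confidences=[0.9, 0.8, 0.2, 0.9],
)
# (0, 1) passes, (1, 2) fails on confidence, (1, 3) fails on coordinates.
connections = [(0, 1), (1, 2), (1, 3)]
print(get_valid_limbs(connections, skeleton, 0.5))
# → [((10, 20), (30, 40))]
```

Note that the coordinates are truncated to integers, because cv2.line expects integer pixel positions.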

Next, render_result loops over that list and uses cv2.line to draw a line between each pair of points. The pairs of points to connect look like this, so